(54) Fast synchronization for programs written in the Java programming language



(57) A method, system, and computer program product for synchronized thread execution in a
multithreaded processor are described. Each synchronized thread refers to at least one object
(702, 703, 705) identified by an object identification (OID) that is shared among a plurality of
synchronized threads. One of the synchronized threads is selected for execution. Upon entering
the selected thread, an entry sequence indicates that the shared object (703) should be locked by
pushing its OID onto a lock stack (701). The operations defined by the selected thread are
executed and the indication is removed by popping the OID from the lock stack.
Description

1. Field of the Invention. 



[0001] The present invention relates, in general, to data processing and, more particularly, to thread synchronization
in JAVA language programs.



2. Relevant Background. 



[0002] The JAVA™ (a trademark of Sun Microsystems, Inc.) programming language is an object-oriented
programming language developed by Sun Microsystems, Inc., the Assignee of the present invention. The JAVA programming
language has found success as a language and programming environment for networked applications. The JAVA
programming language is attractive due to its many features, including standardized support for concurrent execution
of program threads. The JAVA programming language's concurrency features are provided at both a language (syntactic)
level and through a threads library. At the language level, an object's methods can be declared "synchronized".
Methods within a class that are declared synchronized do not run concurrently and run under control of "monitors" to
ensure that variables remain in a consistent state.
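The language-level feature described above can be illustrated with a short sketch. The class and method names below (SharedCounter, increment, runTwoThreads) are hypothetical and chosen only for illustration; the sketch assumes a standard Java runtime and shows two threads updating one object through methods declared synchronized.

```java
// Illustrative sketch only: SharedCounter and its methods are hypothetical names.
class SharedCounter {
    private int value = 0;

    // Only one thread at a time may run a synchronized method on a given instance.
    synchronized void increment() {
        value = value + 1;
    }

    synchronized int get() {
        return value;
    }

    // Starts two threads that each call increment() perThread times,
    // waits for both, and returns the final count.
    static int runTwoThreads(int perThread) {
        SharedCounter counter = new SharedCounter();
        Runnable work = () -> {
            for (int i = 0; i < perThread; i++) {
                counter.increment();
            }
        };
        Thread t0 = new Thread(work, "thread 0");
        Thread t1 = new Thread(work, "thread 1");
        t0.start();
        t1.start();
        try {
            t0.join();
            t1.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return counter.get();
    }
}
```

Because increment() is synchronized, the two threads never interleave inside it, so no update is lost and the final count equals twice the per-thread count.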

[0003] Each time a synchronized method is entered or exited, the JAVA language requires calls to the operating
system (O/S) kernel to allocate thread synchronization resources. Calls to the kernel may require tens if not hundreds
of instructions depending on the O/S in use. In comparison, the synchronized method itself may require only a few
lines of code. As an example, a dictionary hash table method can be implemented with fewer than ten instructions, but
to implement it as a synchronized method requires more than 100 instructions in a typical operating system. Hence,
thread synchronization significantly adds to the execution time of many programs.

[0004] This overhead is required in programs that make heavy use of multi-threading and depend on thread
synchronization. However, the overhead is undesirable in programs that are single-threaded. Similarly, even in
multi-threaded programs, a large number of the threads may in fact execute correctly without the synchronization overhead.
Hence, a need exists for a thread synchronization mechanism that incurs the overhead associated with O/S thread
management resource allocation only when those resources are needed.

[0005] Operating systems conventionally enable multithreading in one of two ways: preemptable and non-preemptable.
A preemptable thread operating system (e.g., Solaris and Windows/NT) includes O/S techniques and devices that
enable one thread to interrupt another concurrently executing thread. Hence, at any given time, an executing thread
cannot predict whether it will continue to execute or whether it will be blocked by another thread. The application
therefore cannot manage thread synchronization on its own because it lacks visibility as to when threads will be
blocked. Preemptable threads are also valuable in multiprocessing machines to efficiently distribute execution of
threads across multiple processors.

[0006] Non-preemptable multithreading is a simpler form of multithreading that supports a mode of thread execution
whereby once a thread begins to execute, it cannot be blocked by another thread. A thread may halt or block itself,
yield control to other threads, or be blocked by virtue of waiting for input/output (I/O) to complete. There remain a large
number of applications that can be implemented as single threads and which do not require the preemptive multithreading
features of an operating system. Non-preemptive operating systems will likely exist in information appliances and
simpler operating systems for some time. Where the O/S ensures that each thread cannot be preempted, allocation
of thread synchronization resources in the O/S increases program execution time with little benefit.

[0007] Multithreading takes advantage of parallelism inherent in (or designed into) many programs. However, legacy
programs often exhibit little parallelism. Moreover, some programs by the nature of their behavior do not exhibit a high
degree of parallelism. These programs are slowed by having to incur the overhead associated with multithreading
operating systems without reaping any benefits because of their non-parallel structure. Hence, a need exists for a
thread synchronization mechanism that speeds up execution in programs that are essentially un-threaded yet running
on a multithreading operating system.

SUMMARY OF THE INVENTION

[0008] Briefly stated, the present invention involves a method for synchronized thread execution in a multithreaded
processor. Each synchronized thread refers to at least one object identified by an object identification (OID) that is
shared among a plurality of synchronized threads. One of the synchronized threads is selected for execution. Upon
entering the selected thread, an entry sequence indicates that the shared object should be locked by pushing its OID
onto a lock stack. The operations defined by the selected thread are executed and the indication is removed by popping
the OID from the lock stack.

[0009] The foregoing and other features, utilities and advantages of the invention will be apparent from the following
more particular description of preferred embodiments of the invention as illustrated in the accompanying drawings. Still
other embodiments of the present invention will become readily apparent to those skilled in the art from the following
detailed description, wherein is shown and described only the embodiments of the invention by way of illustration of
the best modes contemplated for carrying out the invention. As will be appreciated, the invention is capable of other
and different embodiments and several of its details are capable of modification in various obvious respects, all without
departing from the spirit and scope of the present invention. Accordingly, the drawings and detailed description are to
be regarded as illustrative in nature and not as restrictive.

BRIEF DESCRIPTION OF THE DRAWINGS 


[0010] 

FIG. 1 shows a computer system implementing the procedure and apparatus in accordance with the present invention;

FIG. 2 shows a thread execution timeline of a first example thread;

FIG. 3 shows a thread execution timeline of a second example situation;

FIG. 4 shows a thread execution timeline of a third example situation;

FIG. 5 shows a thread execution timeline of a fourth example situation;

FIG. 6 shows an exemplary structure for implementing the synchronization method in accordance with the present
invention; and

FIG. 7 shows data structures useful in implementation of the synchronization method in accordance with the present
invention.

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS

[0011] In general, the present invention involves a method for handling synchronized threads in programs written in
the JAVA programming language that postpones initiating the high-overhead thread synchronization structures until it
is determined that they are needed to ensure program correctness. In other words, in many applications, particularly
single-threaded programs, the application itself ensures that the programs are deterministic, and the high overhead
associated with multi-threaded synchronization serves only to slow program execution. Accordingly, in accordance
with the present invention, implementation of complex synchronization mechanisms including full data structures is
postponed such that they may in fact never be implemented.

[0012] Although the problems of thread synchronization are not unique to the JAVA programming language, some
features of the JAVA programming language affect the problem. JAVA programming language synchronization involved
an O/S assigned lock and monitor object that were created each time a program entered a method labeled "synchronized".
The monitor object required data structures that were quite large, including two queue headers, a counter, and
an owner field. The size of these structures prohibits them being included in a standard object header. The present
invention addresses this difficulty by avoiding the creation of these lock and monitor objects until they are necessary.

[0013] The present invention is described in terms of a JAVA programming environment implementation such as a
JAVA virtual machine (JVM), just-in-time (JIT) compiler, or a compile time or run time mechanism that converts JAVA
programming language code into another programming language. However, the present invention is useful in any
programming language that enables statements or program objects to access multithreading features of an operating
system. The present invention does not require changes to the programming language itself, but rather provides a method
for implementing a program such that the program is not forced to perform unnecessary O/S operations to support
thread synchronization.

[0014] FIG. 1 illustrates in block diagram form a computer system incorporating an apparatus and system in accordance
with the present invention. Processor architectures and computing systems are usefully represented as a collection
of interacting functional units as shown in FIG. 1. These functional units perform the functions of fetching instructions
and data from memory, processing fetched instructions, managing memory transactions, interfacing with external I/O
and displaying information.

[0015] FIG. 1 shows a typical general purpose computer system 100 incorporating a processor 102 and using both
an application program and an operating system executing in processor 102. Computer system 100 in accordance
with the present invention comprises a system bus 101 for communicating [...] bus 101 through input/output (I/O)
devices within processor [...] using a memory bus 103 to store information and instructions [...] for example, one
or more levels of cache memory and main memory [...]

[...] coupled to bus 101 and are operative to communicate information in appropriately structured form to and
from the other parts of computer 100. User I/O devices may include a keyboard, mouse, [...] or tape reader,
optical disk, or other available I/O devices, including [...] coupled to bus 101 and may be implemented using one
or more magnetic [...] large banks of random access memory, or the like. A wide variety of random [...]

[...] operating systems that handle both preemptive and non-preemptive threads on a task-by-task basis.

[0018] A primary method for thread synchronization in many software languages (including the JAVA programming
language) [...] example [...] lock operation will [...] the lock operation completes [...]

[...] of FIG. 2, there is not, in general, a need to create associated monitor data structures when an object is
[...] be running, but none can preempt the execution of [...] not generally needed. In these circumstances, the
monitor data structures will only be needed [...] thread is blocked for some time in a manner that is not known
to the application [...] blocking operation is [...] by a series of operation boxes [...] by executing each
synchronized method in a construct called a [...]

[0025] FIGS. 3 through 5 illustrate various situations during execution of synchronized threads labeled thread 0 and
thread 1 and use the same nomenclature and graphical representations described above in reference to FIG. 2. FIG.
3 shows a situation in which thread 0 executes a number of operations, at least one of which accesses a shared object
(not shown). Thread 0 is synchronized with thread 1 in FIG. 3 because thread 1, for example, accesses the shared
object also. Thread 0 and thread 1 may be related as producer-consumer (i.e., one thread changes the shared object
and the other thread uses the changed object) or the shared object may support only one access at a time and so
thread 0 and thread 1 are synchronized to avoid conflict.

[0026] The lock and unlock operations in each thread are performed at any time during the thread execution upon
entering and exiting a critical section of the thread (i.e., a method within a thread containing operations that access
the shared object). A thread may execute more than one lock operation before executing an unlock operation (as
shown in FIG. 2), but each lock must be followed by an unlock before the shared object is available to another thread.
The programming language may allow explicit object locking/unlocking. Alternatively, in the case of the JAVA
programming language, a "synchronized" statement in the method implies the locking/unlocking operations.

[0027] The lock operation is implemented in accordance with the present invention by pushing an object identification
(OID) onto a special purpose stack (701 in FIG. 7). In the particular examples herein, the OID is an address of the
shared object "x" in memory, although any OID that uniquely identifies a locked object is equivalent to the
specific implementation. In contrast with the prior art, the lock operation does not cause the OS to allocate resources
or create data structures to provide thread synchronization. A thread may acquire multiple locks in this manner, although
only a single lock is shown for simplicity. Moreover, the thread may acquire multiple locks on the same object such that
each acquired lock must be unlocked before the object becomes available for another thread.
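A minimal sketch of such a lock stack follows, assuming that a Java object reference can stand in for the OID (the text uses the object's memory address). The class LockStack and its methods lock, unlock, and holdCount are hypothetical names introduced here for illustration and do not appear in the source.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical sketch of a special purpose lock stack; all names are illustrative.
class LockStack {
    // Each entry is an object reference standing in for the OID.
    private final Deque<Object> entries = new ArrayDeque<>();

    // Lock: record the object by pushing its identity. No O/S call is made.
    void lock(Object sharedObject) {
        entries.push(sharedObject);
    }

    // Unlock: remove and return the most recently pushed entry.
    Object unlock() {
        return entries.pop();
    }

    // Number of times the given object is currently recorded (identity comparison),
    // which models a thread holding several locks on the same object.
    int holdCount(Object sharedObject) {
        int count = 0;
        for (Object entry : entries) {
            if (entry == sharedObject) {
                count++;
            }
        }
        return count;
    }

    boolean isEmpty() {
        return entries.isEmpty();
    }
}
```

In this sketch, locking the same object twice leaves two entries on the stack, and the object only stops being recorded as locked after two matching unlock calls.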

[0028] Only when there is an actual monitor object does there need to be any appreciable code executed. This may
block the executing thread, if the lock is already granted. A similar procedure may be executed at unlock time, resuming
a thread. Remembering that the existence of a monitor object is the exception rather than the rule, the entry sequence
into a synchronized method is logically:

    if (monitorExists(o)) {
        monitorLockProcessing(o);   // a monitor already exists: use the conventional lock
    } else {
        pushOnStack(o);             // usual case: record the lock by pushing the OID
    }

and an exemplary exit sequence is: 

    if (monitorExists(o)) {
        monitorUnlockProcessing(o); // a monitor exists: use the conventional unlock
    } else {
        popStack();                 // usual case: remove the OID pushed at entry
    }

where monitorLockProcessing and monitorUnlockProcessing represent methods that call conventional lock/unlock 
operations. 
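The two sequences can be put together in a small runnable sketch. It reuses the names monitorExists, monitorLockProcessing, monitorUnlockProcessing, pushOnStack, and popStack from the text; everything else (the class FastSyncSketch, the createMonitor hook, and the use of java.util.concurrent.locks.ReentrantLock as the "conventional" lock) is an assumption made for illustration. The sketch shows only the control flow and omits the concurrency control a real runtime would need around its monitor table.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.IdentityHashMap;
import java.util.Map;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical sketch of the entry and exit sequences; ReentrantLock is an
// assumed stand-in for the conventional lock/monitor.
class FastSyncSketch {
    private final Deque<Object> lockStack = new ArrayDeque<>();
    private final Map<Object, ReentrantLock> monitors = new IdentityHashMap<>();

    boolean monitorExists(Object o) { return monitors.containsKey(o); }

    // Illustrative hook: creates a monitor for o, as would happen once one is needed.
    void createMonitor(Object o) { monitors.put(o, new ReentrantLock()); }

    void monitorLockProcessing(Object o) { monitors.get(o).lock(); }

    void monitorUnlockProcessing(Object o) { monitors.get(o).unlock(); }

    void pushOnStack(Object o) { lockStack.push(o); }

    void popStack() { lockStack.pop(); }

    // Entry sequence into a synchronized method.
    void enterSynchronized(Object o) {
        if (monitorExists(o)) {
            monitorLockProcessing(o);
        } else {
            pushOnStack(o);
        }
    }

    // Exit sequence from a synchronized method.
    void exitSynchronized(Object o) {
        if (monitorExists(o)) {
            monitorUnlockProcessing(o);
        } else {
            popStack();
        }
    }

    int stackDepth() { return lockStack.size(); }

    boolean monitorHeld(Object o) {
        ReentrantLock m = monitors.get(o);
        return m != null && m.isHeldByCurrentThread();
    }
}
```

When no monitor exists for an object, entering and exiting touch only the stack. Once createMonitor has been called for an object, the same two calls take the conventional lock path and leave the stack untouched.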

[0029] It is important to note that the synchronization technique of the present invention does not impose ordering
constraints on the execution of thread 0 and thread 1. In other words, thread 1 may execute before thread 0 in any of the
examples, and other mechanisms must be employed to constrain order. Examples of these mechanisms in the JAVA
programming language include the wait(), notify() and notifyAll() methods of the JAVA programming language
Object class, although equivalents are available in other program languages.
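As an illustration of constraining order separately from locking, the following sketch uses wait() and notifyAll(), which every Java object inherits from java.lang.Object. The class Handoff and its methods are hypothetical names; the sketch assumes a single-slot producer-consumer relationship of the kind mentioned for thread 0 and thread 1.

```java
// Illustrative single-slot hand-off; Handoff, put, take and demo are hypothetical names.
class Handoff {
    private Integer slot = null;

    // Producer side: waits until the slot is empty, then fills it.
    synchronized void put(int value) throws InterruptedException {
        while (slot != null) {
            wait();
        }
        slot = value;
        notifyAll();
    }

    // Consumer side: waits until the slot is full, then empties it.
    synchronized int take() throws InterruptedException {
        while (slot == null) {
            wait();
        }
        int value = slot;
        slot = null;
        notifyAll();
        return value;
    }

    // Starts the consumer first, then produces one value, and returns
    // what the consumer received.
    static int demo(int value) {
        Handoff handoff = new Handoff();
        int[] received = new int[1];
        Thread consumer = new Thread(() -> {
            try {
                received[0] = handoff.take();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        consumer.start();
        try {
            handoff.put(value);
            consumer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return received[0];
    }
}
```

The synchronized keyword alone would let either thread run first; it is the wait() loop in take() that forces the consumer to run after the producer has stored a value.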






EP 0 955 584 A2 




[0030] In the example of FIG. 3, thread 1 and thread 0 are not CPU bound [...] CPU cycles as compared to the
number of operations that must [...] thread conflict with each other during their execution even though [...]
synchronized. The number of clock cycles required to execute [...] exception of memory operations, operations
that require input/output (I/O) [...] As shown in FIG. 3, [...] O/S thread synchronization resources are [...]
([...] they are nevertheless allocated in the prior art). A significant feature of the present invention is that
the [...] without a call to the OS by simply pushing an OID onto stack 701. In accordance with [...]

[...] In the second embodiment a lock is never implemented in the example of FIG. 4 because thread 0 resumes
before any other thread has created a lock.

[...] FIG. 5 shows [...] example including thread 0 that blocks at operation 8 during its execution. In FIG. 5,
the blocking operation in thread 0 resumes at operation 18. At operation [...] thread 1 [...]

[...] engine 604. The executable instructions may include calls to OS 606 [...] executed by execution [...]
interpreter 602 may be implemented in [...] or operable at run time such as a just-in-time compiler [...]
execution by execution engine 604 at run time.

[0038] Referring to FIG. 7, a plurality of objects 702, 703, and 705 are shown. Objects 703 are locked by a
corresponding entry in stack 701. In contrast, object 705 is locked by a monitor object 704 created in a conventional fashion.
The fact that an object is locked and the number of times it is locked is recorded by execution engine 604 (shown in
FIG. 6) by pushing the address of the object (indicated as object ID or OID in FIG. 7) onto stack 701 at lock time. The
entry is popped from stack 701 at unlock time and the object associated with the popped OID becomes an unlocked
object 702. In this way, if the thread does not block while an object is locked, there is nothing more that needs to be
done to ensure thread synchronization. If, however, the thread does block, then the objects on stack 701 can be allocated
monitor data structures 704 that are then used in a traditional way, and stack 701 is emptied.
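The deferred creation of monitor data structures described above might be sketched as follows. This is an assumption-laden illustration, not the patented implementation: the names MonitorRecord, LazyMonitors, and onBlock are hypothetical, and the roles given to the two queues (entry and wait) are guesses, since the description only says the monitor has two queue headers, a counter, and an owner field.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical monitor record holding the fields the description lists.
class MonitorRecord {
    final List<Thread> entryQueue = new ArrayList<>(); // first queue header (assumed role)
    final List<Thread> waitQueue = new ArrayList<>();  // second queue header (assumed role)
    int count;                                         // number of times the object is locked
    Thread owner;                                      // thread that holds the lock
}

// Hypothetical sketch: monitors are created only when the locking thread blocks.
class LazyMonitors {
    private final Deque<Object> lockStack = new ArrayDeque<>();
    private final Map<Object, MonitorRecord> monitors = new IdentityHashMap<>();

    void lock(Object o) { lockStack.push(o); }

    void unlock() { lockStack.pop(); }

    // Called when the owning thread is about to block: each stacked entry is
    // turned into (or counted against) a monitor record, and the stack is emptied.
    void onBlock(Thread owner) {
        while (!lockStack.isEmpty()) {
            Object o = lockStack.pop();
            MonitorRecord m = monitors.computeIfAbsent(o, k -> new MonitorRecord());
            m.owner = owner;
            m.count++;
        }
    }

    MonitorRecord monitorFor(Object o) { return monitors.get(o); }

    int stackDepth() { return lockStack.size(); }
}
```

If the thread never blocks, onBlock is never called and no MonitorRecord is ever allocated. An object pushed twice before the block ends up with a single record whose count is 2, which preserves the number of times it was locked.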

[0039] As noted above, on the rare occasions that real lock/monitor data structures are created they can be imple- 
mented without the involvement of operating system 606. Memory 107 is allocated into application or user memory 
space 701 and kernel or OS memory space 702. 

[0040] Although the invention has been described and illustrated with a certain degree of particularity, it is understood 
that the present disclosure has been made only by way of example, and that numerous changes in the combination 
and arrangement of parts can be resorted to by those skilled in the art without departing from the spirit and scope of 
the invention, as hereinafter claimed. 



Claims 

1. A method for execution in a processor having a plurality of threads executing thereon, the threads including syn- 
chronized operations that refer to at least one shared object, wherein the shared object is identified by an object 
identification (OID), the method comprising the steps of: 

selecting a first thread of the plurality of threads including a synchronized operation for execution; 

upon entering the selected thread, indicating that the at least one shared object should be locked by pushing 

the OID of the at least one shared object onto a lock stack; 

executing the synchronized operations defined by the selected thread; and 

removing the indication by pushing the OID from the lock stack. 

2. The method of claim 1 wherein during the step of executing the operations defined by the selected thread the 
selected thread blocks, and the method further comprises a step of locking the at least one shared object after the 
selected thread blocks. 

3. The method of claim 2 further comprising the steps of creating an instance of a monitor object corresponding to 
the shared object. 

4. The method of claim 3 wherein the monitor object comprises a queue header, a counter, and an owner field iden- 
tifying the selected thread. 

5. The method of claim 3 wherein the step of locking comprises creating the monitor object in application memory 
space. 

6. The method of claim 4 wherein the step of locking comprises creating the monitor object in kernel memory space.

7. The method of claim 2 wherein the step of locking further comprises: 

determining when the selected thread blocks; and 

performing the locking operation in response to the determining step. 

8. The method of claim 2 wherein the step of locking further comprises: 

determining when the selected thread resumes after the block; and 

performing the locking operation only if one other of the threads including a synchronized operation has blocked 
since the selected thread blocked. 

9. The method of claim 1 further comprising the steps of: 
selecting a second thread for execution, wherein the second thread includes an operation synchronized with
the synchronized operation of the first thread; 

upon entering the second thread, determining whether the shared object is locked. 

10. The method of claim 9 wherein when it is determined that the shared object is locked the method further comprises
the steps of creating a monitor object associated with the shared object to determine when the shared object is
unlocked; and

executing the operations defined by the second thread after it is determined that the shared object is unlocked. 

11. The method of claim 9 wherein when it is determined that the shared object is unlocked the method further com- 
prises the steps of indicating that the shared object should be locked by pushing its OID onto the lock stack; and 

executing the operations defined by the second thread; and 
removing the indication by pushing the OID from the lock stack. 

12. A computer system for executing an application comprising a plurality of synchronized threads of execution wherein
each synchronized thread refers to at least one object identified by an object identification (OID) that is shared
among a plurality of synchronized threads, the computer system comprising:

a processor;

a memory coupled to the processor;

a multithreading operating system that supports multiple threads of execution in a shared address space of
the memory;

a lock stack in the memory, the lock stack comprising a plurality of entries sized to hold an OID;
an instruction interpreter executing in the processor and coupled to receive a selected one synchronized thread
and cause the selected thread to execute on the processor, wherein upon entering the selected thread the
instruction interpreter indicates that the shared object should be locked by pushing its OID onto a lock stack
and upon exiting the selected thread the instruction interpreter removes the indication by pushing the OID
from the lock stack.

13. The computer system of claim 12 further comprising:

a thread block indicator operating within the processor to signal when the selected thread blocks during exe- 
cution; and 

object locking devices operating within the processor and responsive to the thread block indicator to lock the 
shared object identified by the OID in the lock stack. 

14. The computer system of claim 12 wherein the object locking devices comprise an instance of a monitor object 
corresponding to the shared object wherein the monitor object includes a queue header, a counter, and an owner 
field identifying the selected thread.

15. The computer system of claim 14 wherein the monitor object is instantiated in an application memory space of the
memory. 

16. The computer system of claim 14 wherein the monitor object is instantiated in a kernel memory space of the 
memory.

17. The computer system of claim 14 further comprising: 

a thread release indicator operating within the processor to signal when the selected thread releases from a 
block condition during execution; and 

object locking devices operating within the processor and responsive to the thread block indicator to lock the 
shared object identified by the OID in the lock stack only if another of the synchronized threads has blocked 
since the selected thread blocked. 

18. A computer program product comprising: 

a computer usable medium having computer readable code embodied therein for synchronized thread execution
in a multithreaded processor, wherein each synchronized thread refers to at least one object identified
by an object identification (OID) that is shared among a plurality of synchronized threads, the computer program
product comprising:

computer program devices operating in the computer system and configured to cause a computer to select
one of the synchronized threads for execution;

computer program devices operating in the computer system and configured to cause a computer to indicate
that the shared object should be locked by pushing its OID onto a lock stack upon entering the selected thread;

computer program devices operating in the computer system and configured to cause a computer to execute
the operations defined by the selected thread; and

computer program devices operating in the computer system and configured to cause a computer to remove
the indication by pushing the OID from the lock stack.

19. The computer program product of claim 18 further comprising:

computer program devices configured to cause a computer to lock the shared object after the selected thread
blocks.

20. The computer program product of claim 18 further comprising: 

computer program devices configured to cause a computer to create an instance of a monitor object corresponding
to the shared object wherein the monitor object comprises a queue header, a counter, and an owner field
identifying the selected thread.

21. The computer program product of claim 19 further comprising: 

computer program devices configured to cause a computer to determine when the selected thread blocks; and

computer program devices configured to cause a computer to perform the locking operation in response to
the determining step.

22. The computer program product of claim 19 further comprising:

computer program devices configured to cause a computer to determine when the selected thread resumes
after the block; and

computer program devices configured to cause a computer to perform the locking operation only if another of
the synchronized threads has blocked since the selected thread blocked.

23. The computer program product of claim 18 further comprising:

computer program devices configured to cause a computer to create a monitor object associated with a shared
object, the shared object being identified by an object identifier (OID), the monitor object comprising state
information and methods to determine when the shared object is unlocked if the shared object is locked;

computer program devices configured to cause a computer to determine whether the shared object is locked
upon entering the second thread;

computer program devices configured to cause a computer to create a monitor object associated with the
shared object to determine when the shared object is unlocked if the shared object is locked;

computer program devices configured to cause a computer to indicate that the shared object should be locked
by pushing its OID onto the lock stack if the shared object is unlocked;

computer program devices configured to cause a computer to execute the operations defined by the second
thread; and

computer program devices configured to cause a computer to remove the indication by pushing the OID from
the lock stack.






24. A computer data signal embodied in a carrier wave comprising: 



a first code portion comprising code configured to cause a computer to create a monitor object associated
with a shared object, the shared object being identified by an object identifier (OID), the monitor object
comprising state information and methods to determine when the shared object is unlocked if the shared object is
locked;

a second code portion comprising code configured to cause a computer to indicate that the shared object 
should be locked by pushing its OID onto the lock stack if the shared object is unlocked; 




a third code portion comprising code configured to cause a computer to execute the operations [...] the [...]

a fourth code portion comprising code configured to cause a computer to remove the indication by pushing [...]







